We've seen what perceptron learning cannot do. Well, actually, that's not quite true: so far we've only looked at perceptrons as computational machines, that is, at what functions they can actually approximate.
And now we'll essentially go into learning.
How do we do that?
Well, for the moment we're going to restrict ourselves to single-output perceptrons.
So how would learning work?
Well, in this case, learning is just weight fitting. And for single-layer, single-output perceptron networks, that is actually just multivariate linear regression.
So we know the math: we define the loss function to be the squared-error loss. (I think there was a bracketing error on the slide there.)
Then we do what we've always done: we plug in the equation for the unit's input and take the gradient, and we get, in this case, a very simple weight update rule. It looks essentially the same as the one we've had before, and we can use it to adjust the weights according to the inputs so that the empirical loss is actually minimized.
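Concretely, with the bracketing fixed, the standard form looks like this. This is a reconstruction in the usual notation, so a sketch rather than the slide's exact symbols:

```latex
% Squared-error loss on a single example (x, y),
% with hypothesis h_w(x) = w . x in the linear-regression case:
L(\mathbf{w}) = \bigl(y - h_{\mathbf{w}}(\mathbf{x})\bigr)^{2}

% Gradient descent on L gives the weight update rule,
% where \alpha is the learning rate:
w_i \leftarrow w_i + \alpha\,\bigl(y - h_{\mathbf{w}}(\mathbf{x})\bigr)\,x_i

% With a differentiable activation g (so h_w(x) = g(w . x)),
% the chain rule adds a factor g'(w . x):
w_i \leftarrow w_i + \alpha\,\bigl(y - h_{\mathbf{w}}(\mathbf{x})\bigr)\,
  g'(\mathbf{w}\cdot\mathbf{x})\,x_i
```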
And then we can run it.
And then we can see what the error rates do.
I have two charts here: in red, the perceptron rule, and in green, decision tree learning.
The first task is to learn the majority function. In this case we have 11 inputs, which is about the size of the restaurant data, where we had, I think, 10 attributes. And you can see that the perceptron learns the majority function extremely well.
Majority means: you take a number of Boolean inputs and output 1 exactly when more than half of them are 1. A very simple function, and the perceptron learns it almost perfectly after 100 examples (see the sketch below).
Decision tree learning, on the other hand, has its problems here, presumably because majority requires a big decision tree: all the inputs are tightly coupled.
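To make this concrete, here is a minimal sketch of the experiment. This is my own code, not the lecture's; the learning rate and the random sampling are assumptions. It trains a hard-threshold perceptron with the perceptron rule on the 11-input majority function:

```python
import random

N = 11        # number of Boolean inputs, as in the lecture's example
ALPHA = 0.1   # learning rate (assumed value)

def majority(xs):
    # Majority is 1 exactly when more than half of the inputs are 1.
    return 1 if sum(xs) > len(xs) / 2 else 0

def predict(w, b, xs):
    # Hard-threshold perceptron: fire if the weighted sum exceeds 0.
    return 1 if sum(wi * xi for wi, xi in zip(w, xs)) + b > 0 else 0

w, b = [0.0] * N, 0.0
for _ in range(100):  # roughly the 100 examples mentioned above
    xs = [random.randint(0, 1) for _ in range(N)]
    err = majority(xs) - predict(w, b, xs)
    for i in range(N):            # perceptron rule: w_i += alpha * err * x_i
        w[i] += ALPHA * err * xs[i]
    b += ALPHA * err

# Majority is linearly separable (all weights 1, threshold N/2 works),
# so the perceptron converges quickly.
test = [[random.randint(0, 1) for _ in range(N)] for _ in range(500)]
acc = sum(predict(w, b, xs) == majority(xs) for xs in test) / len(test)
print(f"accuracy after 100 training examples: {acc:.2f}")
```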
In the restaurant example, you have essentially the opposite situation. Restaurants work extremely well with decision trees; remember, we had all these information-theoretic tools there, we had pruning, and so on. Whereas the perceptrons are hopeless: they aren't really getting better at all.
Even though, as we know, both of them learn Boolean functions, it really seems to depend on what kind of function we need to learn. And of course, the perceptron here is a single-layer perceptron, so we have a realizability problem: the restaurant function is not linearly separable, so the true function simply isn't in the perceptron's hypothesis space.
Okay.
Now we get into something we would nowadays call deep learning, and that, scientifically speaking, we would call multilayer perceptrons.
So I have a very simple one here: we have an output unit, we have the input units, and we have a couple of hidden units in between. And the way you do these things is that you choose the network structure by hand; only the weights are learned (a small sketch of the forward pass follows below).
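As a concrete illustration, here is a minimal sketch of the forward pass through such a hand-chosen network. The sizes and weight values below are made up for illustration, not taken from the lecture:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, W_hidden, b_hidden, w_out, b_out):
    # Each hidden unit computes g(w . x + b); the single output unit
    # then does the same over the hidden activations.
    hidden = [sigmoid(sum(wi * xi for wi, xi in zip(row, x)) + b)
              for row, b in zip(W_hidden, b_hidden)]
    return sigmoid(sum(wi * hi for wi, hi in zip(w_out, hidden)) + b_out)

# Hand-chosen structure: 2 inputs, 2 hidden units, 1 output unit.
# The structure is fixed by hand, exactly as described above;
# learning would only adjust the weights.
x        = [1.0, 0.0]
W_hidden = [[0.5, -0.5], [-0.3, 0.8]]
b_hidden = [0.1, -0.1]
w_out    = [1.0, -1.0]
b_out    = 0.0
print(forward(x, W_hidden, b_hidden, w_out, b_out))
```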